ACETYL-COA-CARBOXYLASE FROM CANDIDA ALBICANS



The present invention relates to acetyl-CoA-carboxylase (ACCase) genes from
Candida albicans (C. albicans) and to methods for their expression. The invention also relates to
novel hybrid organisms for use in such expression methods.

C. albicans is an important fungal pathogen and the most prominent target organism for
antifungal research. ACCase is an enzyme of fatty acid biosynthesis and is essential for fungal
growth and viability. Inhibitors of the ACCase enzyme should therefore be potent antifungals.
The ACCase proteins of all organisms are homologous to each other, but they also differ
significantly in amino acid sequence. Because of selectivity problems (for example fungal
versus human), it is extremely important to optimise potential inhibitor leads directly against the
target enzyme (from C. albicans) and not against a homologous but non-identical model protein, for
example from Saccharomyces cerevisiae (S. cerevisiae).

We have now successfully cloned the ACCase gene from C. albicans (hereinafter
referred to as the C. albicans ACC1 gene) and elucidated its full-length DNA sequence and
corresponding polypeptide sequence, as set out in Figures 4 and 5 of this application
respectively. The coding DNA sequence of the C. albicans ACC1 gene is 6810 nucleotides in
length and the corresponding protein sequence is 2270 amino acids in length. As will be
explained below, there are two forms of the C. albicans ACC1 gene; the above numbers relate
to the longer version, Met1.

Therefore in a first aspect of the present invention we provide a polynucleotide
encoding a C. albicans ACCase gene, in particular the (purified) C. albicans ACC1 gene as set
out in Figure 5 hereinafter. It will be appreciated that the polynucleotide may comprise any of
the degenerate codes for a particular amino acid, including the use of rare codons. The
polynucleotide is conveniently as set out in Figure 4. It will be apparent from Figure 4 that the
gene is characterised by the start codons Met1 and Met2 (as indicated by the first and second
underlined atg codons, hereinafter referred to as atg1 and atg2 respectively). Both forms of the
gene, starting from Met1 and Met2 respectively, are comprised in the present invention. The
invention further comprises convenient fragments of any one of the above sequences.
Convenient fragments may be defined by restriction endonuclease digests of the sequence; suitable
fragments include a full-length C. albicans ACC1 gene (starting with Met1 or Met2) flanked by
unique StuI (5'-end) and NotI (3'-end) restriction sites as detailed in Figure 6.
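By way of illustration only, the relationship between the two candidate start codons can be checked on the stretch of Figure 4 that runs from the first lower-case atg to the second. The following is a minimal sketch, assuming the excerpt below has been transcribed faithfully from Figure 4; the helper names are hypothetical and are not part of the application.

```python
# Excerpt transcribed from Figure 4: from atg1 up to and including atg2.
# The two candidate start codons are printed in lower case in the figure.
EXCERPT = (
    "atg"
    "AGATGCAAATTATCTCTAATAAAGAATACTAACTCACTTGTACATAGATCGCGTTTCCTAATTACAAAACCACAACTATA"
    "TATACCTCATCGTCATTATATCCCATTCAAGAACATATTCAAGTCATTGTTA"
    "atg"
)

STOP_CODONS = {"TAA", "TAG", "TGA"}


def lowercase_atg_positions(seq):
    """Return the 0-based offsets of every lower-case 'atg' in seq."""
    positions = []
    start = seq.find("atg")
    while start != -1:
        positions.append(start)
        start = seq.find("atg", start + 1)
    return positions


def codons_between(seq, first, second):
    """Split the stretch from `first` up to (not including) `second` into codons."""
    stretch = seq[first:second].upper()
    return [stretch[i:i + 3] for i in range(0, len(stretch), 3)]


atg1, atg2 = lowercase_atg_positions(EXCERPT)
offset = atg2 - atg1
in_frame = offset % 3 == 0
codons = codons_between(EXCERPT, atg1, atg2)
has_stop = any(codon in STOP_CODONS for codon in codons)

print(f"atg1 -> atg2 offset: {offset} nt, in frame: {in_frame}")
print(f"codons before Met2 (including Met1): {len(codons)}, stop codon present: {has_stop}")
```

On this excerpt the two atg codons lie 135 nucleotides (45 codons) apart in the same reading frame with no intervening stop codon, which is consistent with the Met1 form being an N-terminally extended version of the Met2 form.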

We also provide a polynucleotide probe comprising any one of the above sequences or
fragments together with a convenient label or marker, preferably a non-radioactive label or
marker. Following procedures well known in the art, the probe may be used to identify
corresponding nucleic acid sequences. Such sequences may be comprised in libraries, such as
cDNA libraries. We also provide RNA transcripts corresponding to any of the above C.
albicans ACC1 sequences or fragments.

In a further aspect of the invention we provide a C. albicans ACC1 enzyme, especially
the ACC1 enzyme having the polypeptide sequence set out in Figure 5, in isolated and purified
form. This is conveniently achieved by expression of the coding DNA sequence of the C.
albicans ACC1 gene set out in Figure 4, using methods well known in the art (for example as
described in the Maniatis cloning manual: Molecular Cloning: A Laboratory Manual, 2nd
Edition, 1989, J. Sambrook, E.F. Fritsch & T. Maniatis). As indicated for Figure 4 above, the
enzyme is characterised by two forms, Met1 and Met2. Both forms of the enzyme are comprised
in the present invention.

The C. albicans ACC1 enzyme of the present invention is useful as a target in
biochemical assays. However, to provide sufficient enzyme for a biochemical assay for C.
albicans ACC1 (for example, for a high-throughput screen for enzyme inhibitors) the enzyme has to be
purified. Two major constraints impair this purification.

1) Any new organism will necessitate deviation from published procedures because it will differ
in its lysis and protease activity. C. albicans is known to express and secrete many aspartyl
proteases.

2) The expression of C. albicans ACC1 is very low, and satisfactory purification results can only
be achieved if the enzyme is overexpressed.

We have now been able to overcome these problems by controlled overexpression of the
C. albicans ACC1 gene in a Saccharomyces strain. This means that subsequent purification of the
enzyme may then, for example, follow published procedures.

Therefore in a further aspect of the present invention we provide a novel expression
system for expression of a C. albicans ACC1 gene, which system comprises an S. cerevisiae
host strain having a C. albicans ACC1 gene inserted in place of the native ACC1 gene from S.
cerevisiae, whereby the C. albicans ACC1 gene is expressed. Preferred S. cerevisiae strains
include JK9-3Daa and its haploid segregants.

The C. albicans ACC1 gene is preferably over-expressed relative to the level that may be
achieved by a C. albicans wild-type strain, i.e. under the control of its own ACC1 promoter.
Whilst we do not wish to be bound by theoretical considerations, we have achieved
approximately 14-fold over-expression relative to the wild-type host S. cerevisiae strain JK9-3D.
This may be achieved by replacing the C. albicans promoter in the expression construct by a
stronger and preferably inducible promoter such as the S. cerevisiae GAL1 promoter.

Controlled overexpression is used to improve expression of a C. albicans polypeptide
relative to expression under the control of a C. albicans promoter. In addition, using procedures
outlined in the accompanying examples, we have been able to isolate a fully functional C.
albicans ACC1 gene, as determined by 100% inhibition by Soraphen A.

The novel expression system is conveniently prepared by transformation of a
heterozygous ACC1 deletion strain of a convenient S. cerevisiae host by a convenient plasmid
comprising the C. albicans ACC1 gene. Transformation is conveniently effected using methods
well known in the art of molecular biology (Ito et al. 1983).

The plasmid comprising the C. albicans ACC1 gene and used to transform a convenient
S. cerevisiae host represents a further aspect of the invention. Preferred plasmids for insertion
of the C. albicans ACC1 gene include YEp24, pRS316 and pYES2 (Invitrogen).

The heterozygous ACC1 deletion strain of a convenient (diploid) S. cerevisiae host is
conveniently achieved by disruption, preferably using an antibiotic resistance cassette such as
the kanamycin resistance cassette described by Wach et al. (Yeast, 1994, 10, 1793-1808).

The expression systems of the invention may be used together with, for example, cell
growth and enzyme isolation procedures identical to or analogous to those described herein, to
provide an acetyl-CoA-carboxylase (ACCase) gene from C. albicans in sufficient quantity and
with sufficient activity for compound screening purposes.

In a further aspect of the invention we provide the use of an acetyl-CoA-carboxylase
(ACCase) gene from C. albicans in assays to identify inhibitors of the polypeptide. In particular
we provide its use in pharmaceutical or agrochemical research.

As presented above, the C. albicans ACC1 enzyme may be used in biochemical assays to
identify agents which modulate the activity of the enzyme. The design and implementation of
such assays will be evident to the biochemist of ordinary skill. The enzyme may be used to
turn over a convenient substrate whilst incorporating/losing a labelled component to define a
test system. Test compounds are then introduced into the test system and measurements made
to determine their effect on enzyme activity. Particular assays are those used to identify
inhibitors of the enzyme useful as antifungal agents. By way of non-limiting example, the
activity of the ACC1 enzyme may be determined by (i) following the incorporation (HCO3,
acetyl-CoA) or loss (ATP) of a convenient label from the relevant substrate (T. Tanabe et al.,
Methods in Enzymology, 1981, 71, 5-60; M. Matsuhashi, Methods in Enzymology, 1969, 14,
3-16), (ii) following the release of inorganic phosphate from ATP (P. Lanzetta et al., Anal.
Biochem. 1979, 100, 95-97), or (iii) following the oxidation of NADH in a coupled assay, for
example using either fatty acid synthetase or pyruvate kinase/lactate dehydrogenase enzymes.
Convenient labels include carbon-14, tritium, and phosphorus-32 or -33.

Any convenient test compound(s) or library of test compounds may be used. Particular 
test compounds include low molecular weight chemical compounds (molecular weight less than 
1500 daltons) suitable as pharmaceutical agents for human, animal or plant use.

The enzyme of the invention, and convenient fragments thereof, may be used to raise
antibodies. Such antibodies have a number of uses which will be evident to the molecular
biologist of ordinary skill. Such uses include (i) monitoring enzyme expression, and (ii) the
development of assays to measure enzyme activity and precipitation of the enzyme.

In addition we provide antisense polynucleotides specific for all or a part of an ACC1
polynucleotide of the invention.

The invention will now be illustrated but not limited by reference to the following
Table, Example, References and Figures wherein:

Table 1 shows the comparative properties of native and recombinant acetyl-CoA
carboxylase enzymes.

Figure 1 shows partial sequence from the C. albicans genome. Underlined regions
were used to derive PCR primers, to generate a C. albicans ACC1 specific probe.

Figure 2 shows cloned fragments of the C. albicans ACC1 gene isolated from genomic
DNA libraries. Arrows indicate extension of the fragment beyond the region displayed.

Figure 3 shows sequenced XbaI-HinDIII and HinDIII subclones of clone CLS1-b1.

Figure 4 shows the full DNA sequence of the C. albicans ACC1 gene. The atg start
codons for Met1 and Met2 are in lower case and underlined, as is the tag stop codon.

Figure 5 shows the full protein sequence of the C. albicans ACC1 gene. Putative start
codons for Met1 and Met2 are shown in bold.

Figure 6 shows the generation of a tailored ACC1 gene (minus promoter) for
expression under control of the GAL1 promoter in plasmid pYES2. From the initial ACCase
gene (line 1) the core SacI-BamHI fragment (line 3) is modified by the addition of 3' BamHI-NotI
(line 2) and 5' StuI-SacI (different fragments for Met1 and Met2, lines 5 and 7 respectively) to
generate the final "portable" gene flanked by StuI-NotI (lines 6 and 8).

Figure 7 shows the results of the in-vitro ACCase enzyme assay set out in the
accompanying Example when Soraphen A (a specific inhibitor of the ACCase enzyme) was
supplied (X-axis) over the range 0.1 nM-100 uM in the dose response regimen of the assay.

Example 1 

Cloning of the C. albicans ACC1 gene and generation of a heterologous S. cerevisiae 
expression system: 


1) Probe generation 

We used the polymerase chain reaction (PCR) to generate a DNA probe between and
including the underlined regions in Figure 1.

2) Identification of clones from a C. albicans genomic library hybridising to the ACCase
probe

The PCR product was labelled using an "ECL direct nucleic acid labelling and
detection kit" (Amersham) as described by the supplier. The PCR product (probe) was then
shown to hybridise to S. cerevisiae (weakly) and C. albicans genomic DNA in a Southern
blot procedure (as described by Maniatis, 1989). Two genomic DNA libraries (CLS1 and CLS2)
of C. albicans (in the yeast-E. coli shuttle plasmids YEp24 and pRS316 respectively, as
described in Sherlock et al. 1994; source: Prof. John Rosamond, Manchester University) were
used to isolate fragments hybridising with the probe, which was radiolabelled using "Ready To
Go" dCTP labelling beads (Pharmacia, as described by the manufacturer). The colony
hybridisation was carried out as described by Maniatis (1989). Hybridising colonies were
identified, and plasmid DNA was isolated, purified (Qiagen maxiprep, as described by the supplier)
and sequenced (Applied Biosystems, model 377 sequencer) from the insert junctions with the
plasmid. Several fragments carrying partial ACCase gene sequence as well as one full-length
clone could be identified (Figure 2).

3) Sequencing of the cloned gene, comparison with ACCases from S. cerevisiae, other 
fungi and higher eukaryotes (plants, mammals, man) 

The bulk of the sequence of the C. albicans ACC1 gene was determined (on both
strands) using flanking sequence- or insert sequence-specific primers from defined HinDIII
and XbaI-HinDIII subfragments (of clone CLS1-b1) cloned into pUC19 (see Figure 4). The
promoter and 5' coding region absent from this clone were established from CLS2-d1 and the
gene's 3' end from CLS2-13 using insert-specific primers. All junctions, including the ones
between the HinDIII subfragments, were verified from the full-length clone CLS2-13 (in
YEp24). The full-length DNA sequence of C. albicans (Ca) ACC1 is shown in Figure 5a and
the protein translation in Figure 5b. The two potential start methionines, Met1 and Met2, are
shown in bold.

The protein is homologous to ACCases of other fungi (S. cerevisiae, S. pombe and
U. maydis) and also to the plant (Brassica napus), mammalian (sheep, chicken and rat) and
human enzymes. Of the two potential start codons of C. albicans ACC1, Met2 seems the
more likely one, as the sequence between Met1 and Met2 is unrelated to the other ACCases
and indeed to any other protein sequence in the EMBL/GenBank database. The high degree of
homology between ACCases of different species and the apparent lack of an identifiable
fungal subgroup make it even more important to use the actual target enzyme (here from the
pathogen C. albicans) as a screening tool to identify specific inhibitors.



4) Generation of a heterozygous ACC1 deletion strain of S. cerevisiae 

As ACCase is an essential enzyme, only one allele of a diploid cell can be deleted
without loss of viability. One ACC1 gene of a diploid S. cerevisiae strain (JK9-3Daa, Kunz et
al. 1993) was therefore disrupted using the kanamycin resistance cassette described by
Wach et al., following the protocol described therein. Sporulation of the heterozygous diploid
(ACC1/acc1::KANMx) yields only two viable spores (which are kanamycin-sensitive),
showing the essentiality of the ACC1 gene as well as the characteristic arrest phenotype for
the two inviable spores (as published by Hasslacher et al., 1993).



5) Complementation of a S. cerevisiae ACC1 deletion with the cloned Candida gene,
CaACC1

The heterozygous ACC1/acc1::KANMx strain was transformed with one full-length C.
albicans gene (CLS2-13 in YEp24). Expression of the gene from this plasmid will be due to
functionality of the Candida ACC1 promoter in the heterologous S. cerevisiae system.
Complementation of the knockout was demonstrated by sporulating the diploid transformants.
In most cases 3-4 viable (haploid) spores were detected. The analysis of tetrads indicated that
kanamycin-resistant colonies were only formed if they also contained the complementing
CLS2-13 plasmid, as indicated by the presence of the URA3 transformation marker. This
clearly shows that the C. albicans gene fully complements the ACCase function in S.
cerevisiae. Therefore the strain generated can be used to screen for inhibitors which are
specific for the Candida enzyme in the absence of a background of Saccharomyces enzyme.
As demonstrated by its functionality, the heterologous protein folds correctly in the host, S.
cerevisiae, where it must also have been correctly biotinylated by the S. cerevisiae machinery
(carried out by ACC2, encoding protein-biotin-ligase).

To facilitate purification of C. albicans ACCase, it is beneficial to achieve
overexpression of the protein in a suitable host. Therefore the C. albicans promoter was
replaced by the stronger and inducible S. cerevisiae GAL1 promoter. As the Candida
sequence had revealed two potential start codons (see Figure 4) for the ACC1 reading frame,
both versions were placed under GAL1 control. To generate appropriate restriction sites for
cloning, the ACC1 gene was modified via PCR at both ends (see Figure 6 above), and cloned
into plasmid pYES2 (Invitrogen) as a StuI-NotI fragment into the HinDIII (fill-in)-NotI sites of
the vector. The identity of the PCR-modified gene parts with the original ones was confirmed
by sequencing. Both constructs (Met1 and Met2) complement the S. cerevisiae ACC1
knockout when the cells are grown on galactose but not on glucose (where the GAL1
promoter is switched off). Growth is very poor if the gene is transcribed initiating at Met1,
whereas Met2 restores wild-type growth rates in S. cerevisiae.

6) Overexpression of the Ca ACCase to facilitate protein purification and use for 
screening purposes 



Materials

Growth media:

Sabouraud dextrose broth

Yeast peptone dextrose broth (YPD)

Yeast peptone galactose broth (YPGal) (i.e. 2% w/v galactose)

Growth of cells 

Candida albicans B2630 (Janssen Pharmaceutica, Beerse, Belgium) was maintained
on Sabouraud dextrose agar slopes at 37 °C, which were subcultured biweekly. For the
growth of liquid cultures for experiments, C. albicans grown on Sabouraud dextrose agar for
48 h at 37 °C was used to inoculate 50 ml Sabouraud dextrose broth containing 500 ug/l
d-biotin. This was incubated for 16 h at 37 °C on a platform shaker (150 rpm). 1.5 ml of this
culture was added to each of 24 x 2 litre conical flasks, each containing 1 litre of Sabouraud
dextrose broth containing 500 ug/l d-biotin, giving a final inoculum concentration of
approximately 1.5 x 10^[exponent illegible] cfu/ml. The cultures were grown for 9 h at 37 °C (log phase) with
shaking (150 rpm). Cell numbers in liquid culture were determined spectrophotometrically
(Philips PU8630 UV/VIS/NIR Spectrophotometer) at 540 nm in a 3 cm path length cuvette.
Absorbance was linearly related to cell number up to an O.D. of 2.0.

Saccharomyces cerevisiae strains Mey134 and CLS2-13 were maintained on yeast
peptone dextrose (YPD) agar plates at 30 °C, which were subcultured biweekly. For the
growth of liquid cultures for experiments, the S. cerevisiae strains were grown on YPD agar
for 48 h at 30 °C and were then used to inoculate 50 ml YPD broth containing 500 ug/l
d-biotin, which was incubated at 30 °C for 16 h on a platform shaker (200 rpm). 2.0 ml of this
culture (approx. 4 x 10^8 cfu/ml) was added to each of 24 x 2 litre conical flasks, each
containing 1 litre of YPD broth containing 500 ug/l d-biotin, giving a final inoculum
concentration of approximately 8 x 10^5 cfu/ml. The cultures were grown for 9 h at 30 °C (log
phase) with shaking (200 rpm). Cell numbers in liquid culture were determined
spectrophotometrically (Philips PU8630 UV/VIS/NIR Spectrophotometer) at 540 nm in a 1
cm path length cuvette.
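The inoculum figures quoted for the S. cerevisiae cultures can be cross-checked with a simple dilution calculation. The sketch below is illustrative only; the function name is hypothetical, and the starter density of about 4 x 10^8 cfu/ml is the approximate value given in the text.

```python
def final_inoculum(starter_cfu_per_ml, transfer_ml, medium_ml):
    """Concentration (cfu/ml) after adding transfer_ml of starter culture to medium_ml of broth."""
    total_cfu = starter_cfu_per_ml * transfer_ml
    return total_cfu / (medium_ml + transfer_ml)


# Values from the text: 2.0 ml of a ~4 x 10^8 cfu/ml starter into 1 litre of broth.
conc = final_inoculum(starter_cfu_per_ml=4e8, transfer_ml=2.0, medium_ml=1000.0)
print(f"final inoculum: {conc:.2e} cfu/ml")
```

The result is within 1% of the stated 8 x 10^5 cfu/ml; the small difference arises only from including the 2 ml transfer in the final volume.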

Saccharomyces cerevisiae strains PNS117a 5C, PNS117b 6A, and PNS120a 6C were
maintained on yeast peptone galactose (YPGal) agar plates at 30 °C, which were subcultured
biweekly. For the growth of liquid cultures for experiments, the S. cerevisiae strains were
grown on YPGal agar for 48 h at 30 °C and were then used to inoculate 50 ml YPGal broth
containing 500 ug/l d-biotin and 200 ug/ml kanamycin, which was incubated at 30 °C for 30 h
on a platform shaker (200 rpm). 2.0 ml of this culture (approx. 4 x 10^8 cfu/ml) was added to
each of 24 x 2 litre conical flasks, each containing 1 litre of YPGal broth containing 500 ug/l
d-biotin and 200 ug/ml kanamycin, giving a final inoculum concentration of approximately 8
x 10^5 cfu/ml. The cultures were grown for approximately 23 h at 30 °C (log phase) with
shaking (200 rpm).

Determination of cell number 

Cell numbers were determined using a standard viable count agar-based plating
method, using the appropriate agar media.

Preparation of fungal ACCase enzyme

Cultures of the appropriate yeast strains were grown to the exponential phase of
growth (for Saccharomyces and Candida strains respectively). These were then harvested
by centrifugation (4400 g, 10 min, 4 °C) and washed twice in 700 ml of 50 mM Tris pH 7.5
containing 20% w/v glycerol, resuspending the cell pellet each time. The final washed pellet
was fully resuspended into a thick slurry using 10 to 20 ml of buffer (50 mM Tris pH 7.5
containing 1 mM EGTA, 1 mM EDTA (disodium salt), 1 mM DTT, 0.25 mM Pefabloc
hydrochloride, 1 uM leupeptin hemisulphate, 1 uM pepstatin A, 0.5 uM trypsin inhibitor
and 20% w/v glycerol). The volume of buffer required was dependent on the total packed
cell wet weight (i.e. 1 ml of buffer was added per 6 g of packed wet cell pellet).

The cell paste was homogenised using a pre-cooled Bead-Beater (Biospec
Products, Bartlesville, OK 74005) with 4 x 10 second bursts, allowing 20 second intervals on
ice. The preparation was then centrifuged at 31,180 g for 30 minutes. After centrifugation
the supernatant was immediately decanted into a container, then aliquoted before snap
freezing in liquid nitrogen. The preparation was then stored at -80°C and was found to be
stable for at least 2 months.

All enzyme preparation steps were carried out at +4°C, unless otherwise stated. 


In-vitro ACCase enzyme assay

The assay was conducted in 96-well, flat-bottomed polystyrene microtitre plates. All
test and control samples were tested in duplicate in this assay.

100 ul of the ACCase enzyme preparation (in 50 mM Tris pH 7.5 containing 1 mM
EGTA, 1 mM EDTA (disodium salt), 1 mM DTT, 0.25 mM Pefabloc hydrochloride, 1 uM
leupeptin hemisulphate, 1 uM pepstatin A, 0.5 uM trypsin inhibitor, and 20% w/v glycerol)
was added to each well of the microtitre plate. Each well contained either a 3 ul test sample
made up in DMSO or 3 ul DMSO alone (NB: final DMSO concentrations in the assay were
1.48% v/v). The microtitre plates were placed in a water bath maintained at 37°C. 10 ul of
[14C] NaHCO3 containing 9.25 kBq in 378 mM NaHCO3 was then added to each well. The
reaction was initiated by the addition of 100 ul of acetyl coenzyme A-containing assay
buffer (50 mM Tris pH 7.5 containing 4.41 mM ATP (disodium salt), 2.1 mM acetyl
coenzyme A, 2.52 mM DTT, 10.5 mM MgCl2, and 0.21% w/v albumin [bovine, fraction
V]) (removed from ice 5 minutes before use) to each well. The plates were incubated at 37°C
for 5 minutes. The reaction was then terminated by the addition of 50 ul of 6 M HCl to each
well. In parallel, a pre-stopped assay control was set up, which involved adding the 50 ul of
6 M HCl prior to the [14C] NaHCO3 and the assay buffer (no further HCl additions were made to
these wells after the 5 minute incubation). The DPM values for the pre-stopped assay were
subtracted from those of the normal assay.
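As a worked check of the reagent concentrations in the running reaction, the additions described above can be combined in a short dilution calculation. This is a sketch under a stated assumption, namely that the reaction volume is the simple sum of the four additions made before the stop reagent; the source does not say which volume was used for its own figures. DTT is left out because it is present in both the enzyme buffer and the assay buffer.

```python
# Volumes added to each well, in microlitres (taken from the assay description).
ENZYME_UL = 100.0
SAMPLE_UL = 3.0          # test compound in DMSO, or DMSO alone
BICARBONATE_UL = 10.0    # [14C] NaHCO3 in 378 mM NaHCO3
ASSAY_BUFFER_UL = 100.0

# Stock concentrations in mM (taken from the assay description).
ASSAY_BUFFER_STOCKS_MM = {"ATP": 4.41, "acetyl-CoA": 2.1, "MgCl2": 10.5}
BICARBONATE_STOCK_MM = 378.0


def final_conc(stock, added_ul, total_ul):
    """Concentration after diluting added_ul of stock into a total of total_ul."""
    return stock * added_ul / total_ul


# Assumption: the reaction volume is the sum of all four additions.
total_ul = ENZYME_UL + SAMPLE_UL + BICARBONATE_UL + ASSAY_BUFFER_UL

finals_mm = {
    name: final_conc(stock, ASSAY_BUFFER_UL, total_ul)
    for name, stock in ASSAY_BUFFER_STOCKS_MM.items()
}
finals_mm["NaHCO3"] = final_conc(BICARBONATE_STOCK_MM, BICARBONATE_UL, total_ul)

# DMSO as % v/v, with and without the bicarbonate addition in the denominator.
dmso_all_additions = 100.0 * SAMPLE_UL / total_ul
dmso_without_bicarbonate = 100.0 * SAMPLE_UL / (total_ul - BICARBONATE_UL)

for name, value in finals_mm.items():
    print(f"{name}: {value:.2f} mM")
print(f"DMSO: {dmso_all_additions:.2f}% (all additions), "
      f"{dmso_without_bicarbonate:.2f}% (excluding bicarbonate)")
```

Under this assumption the final concentrations come out close to 2 mM ATP, 1 mM acetyl-CoA, 5 mM MgCl2 and 18 mM NaHCO3. The stated DMSO figure of 1.48% v/v corresponds to 3 ul in 203 ul, that is, without the 10 ul bicarbonate addition; counting all four additions gives about 1.41% v/v.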

After the addition of the stop reagent the plates were left open in the water bath for a
further 30 minutes to allow the 14CO2 to escape. After this time 150 ul of each reaction
mixture was applied onto individual GF/C glass microfibre filter discs and allowed to dry
thoroughly before adding scintillation fluid. Radioactivity in the samples was then
determined by scintillation counting (Wallac WinSpectral 1414, Turku, Finland).

IC50 values were calculated from the data using non-linear regression techniques
available in the ORIGIN software package (Microcal Software Inc., Massachusetts, USA).
Soraphen A, which is a specific inhibitor of ACCase, was supplied over the range
0.1 nM-100 uM in the dose response regimen of the assay.
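The kind of calculation involved in deriving an IC50 from such a dose response can be sketched as follows. This is not the ORIGIN procedure used in the Example: it is a minimal stand-in that fits a two-parameter Hill curve by least-squares grid search, using only the standard library. The DPM values and the dose-response points are synthetic and hypothetical, chosen purely for illustration, and the function names are assumptions.

```python
def percent_activity(dpm, prestopped_dpm, uninhibited_dpm):
    """Background-corrected activity as a percentage of the DMSO-only control."""
    return 100.0 * (dpm - prestopped_dpm) / (uninhibited_dpm - prestopped_dpm)


def hill(conc, ic50, slope):
    """Two-parameter dose-response model: 100% at zero inhibitor, 0% at saturation."""
    return 100.0 / (1.0 + (conc / ic50) ** slope)


def fit_ic50(concs, activities):
    """Least-squares grid search over log10(IC50) and Hill slope."""
    best_sse, best_ic50, best_slope = float("inf"), None, None
    for i in range(-1000, -399):          # log10(IC50 / M) from -10.00 to -4.00
        ic50 = 10.0 ** (i / 100.0)
        for j in range(10, 41):           # Hill slope from 0.50 to 2.00
            slope = j / 20.0
            sse = sum(
                (a - hill(c, ic50, slope)) ** 2 for c, a in zip(concs, activities)
            )
            if sse < best_sse:
                best_sse, best_ic50, best_slope = sse, ic50, slope
    return best_ic50, best_slope


# Hypothetical DPM readings showing the pre-stopped background subtraction.
example_activity = percent_activity(dpm=5500.0, prestopped_dpm=500.0,
                                    uninhibited_dpm=10500.0)

# Synthetic, noise-free dose response (NOT data from the Example):
# one point per decade from 0.1 nM to 100 uM, generated with IC50 = 50 nM.
concs = [10.0 ** e for e in range(-10, -3)]
TRUE_IC50 = 5e-8
activities = [hill(c, TRUE_IC50, 1.0) for c in concs]

ic50, slope = fit_ic50(concs, activities)
print(f"example activity: {example_activity:.1f}% of control")
print(f"fitted IC50: {ic50 * 1e9:.1f} nM, Hill slope: {slope:.2f}")
```

A grid search is used here only to keep the sketch dependency-free; a gradient-based non-linear regression routine would normally be preferred, and a four-parameter model may be needed where the top and bottom plateaus are not fixed at 100% and 0%.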
Protein determination

The total protein concentration of each ACCase preparation used was determined by
the Coomassie Blue method (Pierce, Illinois, USA), using 1 cm path length cuvettes read at
595 nm (Philips PU8630 UV/VIS/NIR Spectrophotometer).


In-vitro antifungal activity 

Compounds were tested over a concentration range of 1024 - 0.00098 ug/ml by a
broth-dilution method in microtitre plates using doubling dilutions in YPD or YPGal (both
containing 500 ug/l d-biotin). Stock solutions of inhibitors were prepared at 51.2 mg/ml in
dimethyl sulphoxide (DMSO) (final assay concentration of DMSO was 2% v/v). Each
yeast culture was added to the well to give a final 10^4 cfu/well. The plates were incubated at
30°C for 48 h and MICs determined visually.
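The doubling-dilution range quoted above can be reproduced arithmetically. In the sketch below the number of doubling steps (20) is an inference from the two end points of the range, not a figure stated in the text, and the variable names are hypothetical.

```python
TOP_UG_PER_ML = 1024.0
N_DOUBLING_STEPS = 20  # assumption: inferred from 1024 / 2**20 being about 0.00098

# Each step halves the previous concentration.
series = [TOP_UG_PER_ML / 2 ** k for k in range(N_DOUBLING_STEPS + 1)]

# Top concentration implied by the stock: 51.2 mg/ml diluted to 2% v/v DMSO.
STOCK_MG_PER_ML = 51.2
DMSO_FRACTION = 0.02
top_from_stock = STOCK_MG_PER_ML * 1000.0 * DMSO_FRACTION  # ug/ml

print(f"{len(series)} concentrations, from {series[0]:g} to {series[-1]:.5f} ug/ml")
print(f"top concentration implied by stock: {top_from_stock:g} ug/ml")
```

Twenty halvings of 1024 ug/ml give 21 test concentrations ending at about 0.00098 ug/ml, and a 51.2 mg/ml stock at 2% v/v corresponds to the 1024 ug/ml top concentration, so the stated figures are mutually consistent.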



Discussion 

Expression of ACCase, a biotinylated protein, was monitored by a "biotin-avidin
affinity western blot" as described by Hasslacher et al., 1993. Expression of the C. albicans
ACC1 gene from its own promoter from plasmid YEp24 was comparable to that of the S.
cerevisiae gene (no overexpression). Expression under control of the GAL1 promoter,
however, was considerably higher, indicating a drastically increased level of biotinylated and
therefore fully functional enzyme. Transcription of the gene was fully induced, as the cells
had to be grown on galactose to be viable. (On glucose the GAL1 promoter is completely off,
causing the cells to arrest and eventually die due to insufficient supply of ACCase.) The S.
cerevisiae strain described in this application is a convenient source of the C. albicans
enzyme. The engineered strain possesses no residual background ACCase because the gene
coding for the S. cerevisiae enzyme has been removed. Congenic versions of such a strain
(genetically identical apart from the ACCase gene carried) expressing different ACCases
(e.g. the different human (Abu-Elheiga et al. 1995), mammalian (Lopez-Casillas et al., 1988,
Takai et al. 1988, Barber et al., 1995), plant (Schulte et al., 1994) or other fungal enzymes
(Al-Feel et al., 1992, Saito et al., 1996, Bailey et al., 1995)) can be used as tools for
screening. Differences in growth of such strains may be solely dependent on differences in
their ACCase activity. Differential growth in the presence of ACCase inhibitors (for
example Soraphen A or compounds yet to be identified) indicates selectivity of the drug
towards one type of the ACCase enzyme.

Claims:

1. A polynucleotide encoding an acetyl-CoA-carboxylase (ACCase) gene from Candida
albicans.

2. A polynucleotide as claimed in claim 1 and as set out in Figure 4 herein.

3. A polynucleotide as claimed in claim 2 and characterised by the start codon atg2.

4. A polynucleotide comprising a restriction fragment of a polynucleotide as claimed in
any one of claims 1-3.

5. A polynucleotide probe comprising a polynucleotide as claimed in any one of claims
1-4.

6. An acetyl-CoA-carboxylase (ACCase) polypeptide from Candida albicans in isolated
and purified form.

7. A polypeptide as claimed in claim 6 and as set out in Figure 5.

8. A polypeptide as claimed in claim 7 and characterised by Met2.

9. A polypeptide as claimed in claim 6 and obtained by expression of a polynucleotide as
claimed in any one of claims 1-4.

10. Antibodies specific for a polypeptide as claimed in any one of claims 6-9.

11. An antisense polynucleotide specific for all or a part of a polynucleotide as claimed in
any one of claims 1-4.

12. An RNA transcript corresponding to a polynucleotide as claimed in any one of claims
1-4.

13. An expression system for expression of an acetyl-CoA-carboxylase (ACCase)
polypeptide from Candida albicans, which system comprises an S. cerevisiae host strain
having a Candida albicans ACC1 polynucleotide as claimed in any one of claims 1-3
inserted in place of the native ACC1 gene from S. cerevisiae, whereby the Candida albicans
ACC1 polypeptide is expressed.

14. An expression system as claimed in claim 13 and adapted for controlled
overexpression of the Candida albicans polynucleotide relative to expression under the
control of a Candida albicans promoter.

15. An expression system as claimed in claim 14 and used to provide an acetyl-CoA-carboxylase
(ACCase) gene from Candida albicans in sufficient quantity and with sufficient
activity for compound screening purposes.

16. Use of an acetyl-CoA-carboxylase (ACCase) polypeptide from Candida albicans as
claimed in claim 6, in an assay to identify inhibitors of the polypeptide.

17. Use as claimed in claim 16 in pharmaceutical research.

FIGURE 1



GCACGCTTGACGGTTTTCACCAAATGCGAAAATATGACCAAATTGAGAATCCGAAAATGA 
ATGGATAGAAGATTGGTTACCAACTGAGAAATAACCCCACACATTAGAAGAAGAACGGAA 
ATTCAATTCATGTAAAGAACCACCACTTGGTTTAAAACCTTCACCAGGATCTTCAGAAGT 
AATACGACAAGCAGTACAATGTCCCTTTGGTGTTGGTCTTCTTTGACTAACCAATGAAGT 
TTCTGACTTGAATTCAAAATCAATATCAGTAGTGGTATGAGGATCGGCACCGCACAAAGT 
TCTGATATCTCTGATTCTATGCATTGGTATACCCATAGCAATTTGTAATTGAGCAGCTGG 
TAAATTAACACCTGTCACCATTTCAGTGGTTGGATGTTCAACTTGCAATCTTGGGTTCAA 
TTCCAAAAAGTAGAATTTATCTTCAGCGTGGGGAGTAAAGGTACTCAACAGTACCAGGGG 
GTTACATAACCAACTTATTTTACCCAATCTGACTGGTGGATTT 

FIGURE 2

FIGURE 3

FIGURE 4



AATATATTGCTTCCTTTTGATAGGAAGTAACTCCGAGTGTTTGAATTTGATATATGTTATTCATATACGTTCAATGGCTC 
TCTTCTATGCTTTGTATATACTTTCTTTTGAATAGATACTCATGTAAAGAGATTTGAAACCATATTCTAACCAACAAAAA 
TATTGTACGGTATAGGTTAGAAAAAAAACTCCGTAAGGTCCGCTTACACGGTTAAATTGAAAACACGTTAAAAATATATT
TGGGTAATGGACTAAGCTATATACAGTACTCAACAAAAATGAAATCAAACACAATGTTCTTTGGGAAATTCATTTCATGC 
AACTAGGGTGATTCTCTTTCTACTATCCAACAACGATAACCCTGCTTTTGAAAAATCTTTTCTAAATTCAAATTGATATA 
ATTCTTATTTATATATTACTTTCTTTTTCCCATATAACCCCATTTTTTTTTTGGAATCATATTTGTTTTTGATTTTTGCT 
TTCCCTTTCAGTCTGAGGAACATACTAATTACGAACAACAATTATACATCCAATCTTCATCTAACGAATTGATTATTTAC 

ATTTATTAAACCCTTGGATACAAACTGATTACACTTTTTAGTTAGTTTGTTCAATTATAAGGGTATTATACAACAAAGAT
ATCATTTAAAGTTAAATCTCAATCTGGAATAATAAAAGTATTCAACACTTTTGCTTACAATAGGTATGTTCAAAATCAAT 
TGAAGCCATCGAGATAAGAAATTAAGCAAAAACGTTTACAATTGTTGTGTGTGTGTTGCAGTGTTTGAAGAAGCTCGAGT 
GATTGCTTTTCTTCGGCATCAGCTGTGTTGGGAACATCTTGTCGTTAAAGTTTCGGAGTAATATTAGAGTAATGGAACGA 
AAAAAACAAAATAAAGTTCTGGAACCACAAAGATTTGAAAAATTGGGTAGAAACAAAAAAAAGACAAAGCAGGAACCCAA

CAATAAATGAATAAACACTCAAAAACTACTCACAACAACAACACTTATTTTCACTTGCTTTATTTCTTCGATTTTTTatg
AGATGCAAATTATCTCTAATAAAGAATACTAACTCACTTGTACATAGATCGCGTTTCCTAATTACAAAACCACAACTATA 
TATACCTCATCGTCATTATATCCCATTCAAGAACATATTCAAGTCATTGTTAatgTCAGATCAATCTCCATCTCCTAGTC 
CTAGCGATTCCCTTAGCTACACTACATTACATGAAAATTTGCCATCTCATTTCTTGGGTGGAAATTCAGTTTTGAATGCT 
GAACCTTCTAAAGTCAGAGACTTTGTCAGAGCTCATCAAGGTCATACAGTTATTTCGAAAATTTTAATTGCCAACAATGG 

TATAGCTGCAGTTAAAGAAATCAGATCAGTTAGAAAATGGGCTTATGAAACATTTGGTGACGAAAAAGCCATACAGTTTA
CCGTTATGGCCACTCCAGAAGATTTGGAAGCTAATGCCGAATATATTAGAATGGCCGACCAATTCATTGAAGTCCCTGGT 
GGCACCAATAACAATAACTATGCTAATGTTGATCTCATTGTAGAGATAGCAGAAAGTACAAATGCTCATGCCGTTTGGGC 
TGGGTGGGGGCATGCTTCAGAGAATCCTTTGTTACCAGAAAAATTAGCTGCATCTCCCAAAAAAATTATTTTTATTGGTC 
CTCCTGGTTCAGCTATGAGATCTTTAGGTGACAAGATTTCATCTACTATAGTTGCTCAACATGCTCAAGTACCATGTATT 

CCATGGTCCGGTACTGGTGTTGATGAAGTGAAAATAGACCCACAAACTAATTTGGTTTCTGTTGCTGATGATATTTATGC
CAAAGGGTGCTGTACTAGTCCAGAAGATGGTTTAGAAAAAGCCAAAAAAATTGGGTTCCCAGTTATGATTAAAGCCTCTG 
AAGGTGGTGGTGGTAAAGGTATTAGAAAAGTTGATGATGAGAAAAACTTCATTACCTTATACAACCAAGCAGCTAATGAA 
ATACCAGGTTCTCCTATCTTTATTATGAAGTTAGCAGGTGATGCCAGACATTTAGAAGTTCAATTACTAGCAGATCAATA 
CGGTACTAACATTTCCCTTTTTGGAAGAGATTGTTCCGTACAAAGAAGACACCAAAAGATTATTGAAGAAGCACCAGTCA 

CCATTGCCAGAAAGGAAACTTTCCACGAAATGGAAAATGCAGCAGTCAGATTGGGTAAATTAGTTGGTTATGTATCCGCT
GGTACTGTTGAGTATCTTTACTCCCACGCTGAAGATAAATTCTACTTTTTGGAATTGAACCCAAGATTGCAAGTTGAACA 
TCCAACCACTGAAATGGTGACAGGTGTTAATTTACCAGCTGCTCAATTACAAATTGCTATGGGTATACCAATGCATAGAA
TCAGAGATATCAGAACTTTGTACGGTGCCGATCCTCATACCACTACTGATATTGATTTTGAATTCAAGTCAGAAACTTCA
TTGGTTAGTCAAAGAAGACCAACACCAAAGGGACATTGTACTGCTTGTCGTATTACTTCTGAAGATCCTGGTGAAGGTTT 

TAAACCAAGTGGTGGTTCTTTACATGAATTGAATTTCCGTTCTTCTTCTAATGTGTGGGGTTATTTCTCAGTTGGTAACC
AATCTTCTATCCATTCATTTTCGGATTCTCAATTTGGTCATATTTTCGCATTTGGTGAAAACCGTCAAGCTTCAAGAAAA
CATATGGTTGTTGCCTTGAAAGAATTGAGTATTAGAGGTGATTTTAGAACTACTGTTGAGTATTTAATCAAATTGTTAGA 
AACTCCAGATTTCGAGGATAATACCATTACAACTGGTTGGTTGGATGAATTAATCACCAAAAAGTTGACTGCTGAAAGAC 
CAGATCCAATAGTTGCTGTTGTTTGTGGAGCTGTAACCAAAGCACACATCCAGGCTGAGGAAGAGAAAAAGGAATACATC 

CAATCTTTGGAAAAAGGTCAAGTTCCTCACAGAAACTTATTGAAAACTATTTTCCCAGTTGAGTTTATTTATGAAGGTGA
AAGATACAAGTTCACTGCTACTAAATCTTCAGAAGATAAATATACTTTGTTCCTTAATGGTTCTCGTTGTGTTGTTGGTG 
CACGTTCATTGTCCGATGGTGGTTTATTGTGTGCATTAGATGGGAAATCACATTCTGTCTATTGGAAGGAAGAGGCATCT 
GCCACTAGATTATCAGTTGATGGCAAAACTTGTTTATTAGAAGTTGAAAATGATCCAACACAATTAAGAACTCCATCTCC
AGGTAAATTGGTCAAGTATTTGGTTGACAGTGGTGAACATGTTGATGCTGGTCAACCATACGCTGAAGTCGAAGTTATGA 

AAATGTGTATGCCTTTGATTGCTCAAGAAAATGGGGTAGTGCAGTTGATTAAACAACCGGGTTCCACAGTTAATGCTGGT
GATATCTTGGCCATTTTGGCATTGGACGATCCATCTAAGGTCAAACATGCTAAACCATTTGAAGGTACTTTACCATCTAT 
GGGTGAGCCAAATGTTACAGGTACTAAACCAGCACATAAATTCAATCATTGTGCTGGTATTTTGAAAAACATTTTGGCTG 
GTTATGATAATCAAGTGATTTTGAATTCTACTTTAAAGAGTCTTGGTGAAGTTTTGAAAGACAATGAATTGCCATACTCT 
GAATGGCAACAACAAATTTCAGCTTTACACTCCAGATTGCCACCTAAATTGGATGACGGATTGACTGCATTGGTTGAAAG

AACTCAAAGTAGAGGTGCTGAATTCCCTGCTCGTCAAATTTTAAAACTCATCACCAAATCAATTGCTGAAAATGGTAATG
ATATGTTAGAAGATGTTGTTGCACCATTGGTTTCTATTGCCACAAGTTACCAGAATGGTTTGGTTGAACACGAATACGAT
TACTTTGCATCTTTGATTAACGAATATTATGACGTTGAAAGTTTGTTTTCAGGTGAAAATGTTAGAGAAGATAATGTTAT 
CTTGAAATTAAGAGATGAAAACAAATCTGATTTGAAAAAAGTTATTGGTATTGGTTTGTCTCATTCACGTGTTAGTGCCA 
AGAACAATTTGATTTTAGCTATTTTGGACATTTATGAACCATTGTTGCAATCCAACTCGTCAGTTGCTGCCTCTATCAGA 

GAAGCTTTAAAGAACTTGTTCATTAGACCTCGTGCTTGTGCCAAAGTTGCATTAAAGGCAAGAGAAATTTTAATTCAATG
TTCTTTACCTTCCATCAAGGAAAGATCCGATCAATTGGAACATATTTTGAGGTCATCTGTTGTTCAAACCTCTTATGGTG

AAATTTTTGCTAAACATAGAGAACCAAATTTGGAAATTATTCGTGAGGTTGTTGATTCCAAACATATTGTTTTTGATGTG 
TTGGCACAATTCTTAATCAATCCAGACCCATGGGTTGCCATTGCTGCCGCTGAAGTTTATGTCAGACGTTCATACCGTGC
TTATGATTTGGGTAAAATTGAATATCATGTTAATGACAGACTTCCTATTGTTGAATGGAAATTCAAGTTGGCTAATATGG 
GAGCCGCTGGTGTAAACGATGCTCAACAGGCTGCTGCTGCCGGTGGCGATGATTCGACATCTATGAAACATGCAGCTTCT
GTGTCTGATTTGACCTTTGTTGTTGATTCTAAAACCGAGCATTCCACAAGAACTGGTGTTTTAGCTCCAGCAAGACACTT 
GGATGATGTTGATGAAACTCTTACAGCTGCATTGGAACAATTCCAACCAGCCGATGCTATTTCATTTAAAGCAAAGGGTG 
AAACTCCAGAGTTATTAAATGTTTTGAATATTGTCATTACCAGTATTGATGGTTACTCCGATGAAAATGAATACTTGAGC
AGAATTAATGAAATCTTGTGCGAATACAAAGAAGAGTTGATTTCTGCTGGTGTTCGTCGTGTTACATTTGTTTTTGCTCA 



WO 99/32635 



PCT/GB98/03857 






TCAAATTGGTCAATATCCTAAATATTATACTTTTACTGGTCCTGACTATGAAGAAAACAAGGTTATTAGACACATTGAAC
CAGCTTTGGCTTTCCAATTGGAATTGGGAAGATTAGCCAATTTCGATATCAAACCAATTTTCACTAACAACAGAAACATC 
CATGTATATGATGCAATTGGGAAGAATGCTCCTTCTGATAAAAGATTTTTCACCAGAGGGATTATTAGAACCGGTGTTCT 
TAAAGAAGACATTAGCATTAGTGAATATTTGATTGCTGAATCCAACAGATTAATGAATGATATTTTGGATACTTTAGAAG 
TTATTGACACTTCTAATTCTGATTTAAACCATATTTTCATTAACTTTTCCAATGCTTTCAATGTTCAAGCTTCAGATGTT
GAGGCTGCCTTTGGATCATTCTTAGAAAGATTTGGTAGAAGATTATGGAGATTAAGAGTTACTGGTGCTGAAATTAGAAT 
TGTCTGTACTGATCCTCAAGGTACTTCGTTCCCATTGCGTGCTATCATTAATAATGTTTCTGGTTATGTTGTCAAATCAG 
AATTGTATTTGGAAGTGAAAAATCCTAAAGGTGAATGGGTTTTCAAATCCATTGGTCATCCTGGTTCCATGCATTTGAGA 
CCTATCTCAACTCCATATCCAGTTAAAGAATCTTTACAACCAAAACGTTACAAGGCTCACAATATGGGTACCACTTATGT 

GTATGACTTCCCAGAATTGTTTCGTCAAGCAACAATTTCACAATGGAAAAAATATGGCAAAAAAGTACCAAAAGATGTTT
TCGTGTCTTTAGAATTGATCACTGATGAAACTGATTCCTTAATAGCTGTTGAAAGAGATCCGGGTGCTAACAAAATTGGA 
ATGGTTGGATTCAAAGTCACTGCTAAAACTCCTGAATACCCTCATGGTCGTCAATTAATTATTGTTGCCAATGATATCAC 
CCACAAGATTGGTTCTTTTGGTCCAGAAGAAGATAATTATTTCAACAAGTGTACTGAATTGGCCAGAAAATTAGGTATTC 
CAAGAATTTACCTTTCTGCAAATTCAGGTGCTAGAATTGGTGTTGCTGAGGAATTGATTCCATTATACCAAGTTGCCTGG 

AATGAAGAAGGGTCTCCTGACAAAGGATTCAGATACTTGTACTTGAGTACTGCTGCTAAAGAGTCTTTAGAAAAAGATGG
TAAAAGTGACAGTGTTGTTACTGAACGTATTGTTGAAAAAGGTGAAGAGCGTCATGTCATTAAAGCTATTATTGGTGCCG 
AAGATGGCTTAGGGGTTGAATGTCTTAAAGGATCAGGTTTAATTGCTGGTGCCACATCAAGAGCTTACAAGGATATATTT 
ACCATCACTTTGGTAACTTGTAGATCTGTTGGTATTGGTGCTTATTTGGTTAGATTGGGTCAAAGAGCCATTCAAATCGA 
TGGTCAACCTATTATTTTAACTGGTGCTCCTGCTATCAATAAATTGTTGGGTAGAGAAGTGTATTCTTCCAATCTTCAAT 

TGGGTGGTACTCAAATCATGTACAATAATGGTGTTTCTCATTTGACAGCTAATGATGATTTGGCTGGGGTTGAAAAAATT
ATGGAATGGTTATCATATGTTCCAGCTAAACGTGGTTTACCAGTGCCAATTTTGGAATCAGAAGATTCTTGGGACAGAGA 
TGTTGATTACTACCCACCAAAACAAGAAGCTTTTGATGTTAGATGGATGATCCAAGGTAGAGAAGTTGATGGTGAATATG 
AATCTGGGTTATTTGATAAAGATTCATTCCAAGAAACATTATCTGGTTGGGCTAAAGGTGTTGTTGTTGGTAGAGCACGT 
TTGGGTGGTATTCCAATTGGTGTTATTGGTGTCGAAACCAGAACAGTGGAAAACTTGATTCCTGCTGATCCAGCAAATCC 

AGACTCTACAGAAAGTTTGATTCAAGAAGCAGGTCAAGTGTGGTATCCTAACTCTGCTTTTAAGACAGCACAAGCTATAA
ATGATTTCAACAATGGTGAACAATTGCCATTAATGATTTTAGCAAATTGGAGAGGTTTCTCTGGTGGTCAAAGAGATATG
TACAATGAAGTCTTGAAATATGGTTCATTTATTGTTGATGCTTTAGTTGACTTCAAGCAACCTATCTTCACTTACATTCC 
ACCAAATGGAGAATTGAGAGGTGGCTCTTGGGTTGTTGTTGATCCAACCATCAACTCAGATATGATGGAAATGTATGCCG 
ATGTCGATTCGAGAGCTGGTGTTTTGGAACCAGAAGGTATGGTTGGTATCAAATACAGACGTGATAAATTATTAGCAACT 

ATGGAAAGATTAGATCCAACTTATGGTGAAATGAAAGCTAAGTTAAATGACTCGTCATTATCTCCAGAAGAACACTCGAA
AATAAGCGCCAAATTGTTTGCACGTGAAAAGGCTTTATTACCAATTTATGCTCAAATTTCCGTTCAATTTGCTGACTTGC 
ACGATAGATCAGGTCGTATGTTGGCCAAGGGAGTTATTAGAAAGGAAATCAAATGGACTGATGCTAGACGTTTCTTCTTC 
TGGAGATTGAGAAGAAGATTGAACGAGGAATATGTTTTGAGATTGATTAGTGAACAAATTAAAGATTCTAGCAAATTGGA
AAGAGTTGCCAGATTGAAGAGTTGGATGCCAACTGTTGAATACGATGATGACCAAGCTGTCAGTAACTGGATTGAAGAGA

ACCATGCCAAATTGCAAAAGAGAGTTAATGAATTGAAACAAGAAGTTTCAAGAACCAAGATTATGAGATTATTAAAAGAG
GATCCAAATAGTGCAATTTCTGCAATGAAAGACTATGTTGAAAGATTGTCAAAAGAAGATAAAGAGAAATTCCTCAAGGC
ATTGAAGTAGAAGTGGTTTCCATTAATTCAACTTTTTAATGACATTGAAAGTAGTAGTAGTTGTTGTTTTTTAGATTTAA
GTATATTATATTATGTAATAAATTATAGAAAGTAATTATAGTTTTGACGGTTAATTGACGAGAGTGGGAAATTGGCTTTT 
TTGTTGCTCGTGTGATGAAACAGTGATTGACACAAAAAAATAGACAATGAAAAC

FIGURE 5



MRCKLSLIKNTNSLVHRSRFLITKPQLYIPHRHYIPFKNIFKSLLMSDQSPSPSPSDSLSYTTLHENLPSHFLGGNSVLN
AEPSKVRDFVRAHQGHTVISKILIANNGIAAVKEIRSVRKWAYETFGDEKAIQFTVMATPEDLEANAEYIRMADQFIEVP
GGTNNNNYANVDLIVEIAESTNAHAVWAGWGHASENPLLPEKLAASPKKIIFIGPPGSAMRSLGDKISSTIVAQHAQVPC 
IPWSGTGVDEVKIDPQTNLVSVADDIYAKGCCTSPEDGLEKAKKIGFPVMIKASEGGGGKGIRKVDDEKNFITLYNQAAN 
EIPGSPIFIMKLAGDARHLEVQLLADQYGTNISLFGRDCSVQRRHQKIIEEAPVTIARKETFHEMENAAVRLGKLVGYVS 
AGTVEYLYSHAEDKFYFLELNPRLQVEHPTTEMVTGVNLPAAQLQIAMGIPMHRIRDIRTLYGADPHTTTDIDFEFKSET 

SLVSQRRPTPKGHCTACRITSEDPGEGFKPSGGSLHELNFRSSSNVWGYFSVGNQSSIHSFSDSQFGHIFAFGENRQASR
KHMVVALKELSIRGDFRTTVEYLIKLLETPDFEDNTITTGWLDELITKKLTAERPDPIVAVVCGAVTKAHIQAEEEKKEY
IQSLEKGQVPHRNLLKTIFPVEFIYEGERYKFTATKSSEDKYTLFLNGSRCVVGARSLSDGGLLCALDGKSHSVYWKEEA
SATRLSVDGKTCLLEVENDPTQLRTPSPGKLVKYLVDSGEHVDAGQPYAEVEVMKMCMPLIAQENGVVQLIKQPGSTVNA
GDILAILALDDPSKVKHAKPFEGTLPSMGEPNVTGTKPAHKFNHCAGILKNILAGYDNQVILNSTLKSLGEVLKDNELPY

SEWQQQISALHSRLPPKLDDGLTALVERTQSRGAEFPARQILKLITKSIAENGNDMLEDVVAPLVSIATSYQNGLVEHEY
DYFASLINEYYDVESLFSGENVREDNVILKLRDENKSDLKKVIGIGLSHSRVSAKNNLILAILDIYEPLLQSNSSVAASI
REALKNLFIRPRACAKVALKAREILIQCSLPSIKERSDQLEHILRSSVVQTSYGEIFAKHREPNLEIIREVVDSKHIVFD
VLAQFLINPDPWVAIAAAEVYVRRSYRAYDLGKIEYHVNDRLPIVEWKFKLANMGAAGVNDAQQAAAAGGDDSTSMKHAA
SVSDLTFVVDSKTEHSTRTGVLAPARHLDDVDETLTAALEQFQPADAISFKAKGETPELLNVLNIVITSIDGYSDENEYL

SRINEILCEYKEELISAGVRRVTFVFAHQIGQYPKYYTFTGPDYEENKVIRHIEPALAFQLELGRLANFDIKPIFTNNRN
IHVYDAIGKNAPSDKRFFTRGIIRTGVLKEDISISEYLIAESNRLMNDILDTLEVIDTSNSDLNHIFINFSNAFNVQASD 
VEAAFGSFLERFGRRLWRLRVTGAEIRIVCTDPQGTSFPLRAIINNVSGYVVKSELYLEVKNPKGEWVFKSIGHPGSMHL
RPISTPYPVKESLQPKRYKAHNMGTTYVYDFPELFRQATISQWKKYGKKVPKDVFVSLELITDETDSLIAVERDPGANKI 
GMVGFKVTAKTPEYPHGRQLIIVANDITHKIGSFGPEEDNYFNKCTELARKLGIPRIYLSANSGARIGVAEELIPLYQVA 

WNEEGSPDKGFRYLYLSTAAKESLEKDGKSDSVVTERIVEKGEERHVIKAIIGAEDGLGVECLKGSGLIAGATSRAYKDI
FTITLVTCRSVGIGAYLVRLGQRAIQIDGQPIILTGAPAINKLLGREVYSSNLQLGGTQIMYNNGVSHLTANDDLAGVEK 
IMEWLSYVPAKRGLPVPILESEDSWDRDVDYYPPKQEAFDVRWMIQGREVDGEYESGLFDKDSFQETLSGWAKGVVVGRA
RLGGIPIGVIGVETRTVENLIPADPANPDSTESLIQEAGQVWYPNSAFKTAQAINDFNNGEQLPLMILANWRGFSGGQRD
MYNEVLKYGSFIVDALVDFKQPIFTYIPPNGELRGGSWVVVDPTINSDMMEMYADVDSRAGVLEPEGMVGIKYRRDKLLA 

TMERLDPTYGEMKAKLNDSSLSPEEHSKISAKLFAREKALLPIYAQISVQFADLHDRSGRMLAKGVIRKEIKWTDARRFF
FWRLRRRLNEEYVLRLISEQIKDSSKLERVARLKSWMPTVEYDDDQAVSNWIEENHAKLQKRVNELKQEVSRTKIMRLLK
EDPNSAISAMKDYVERLSKEDKEKFLKALK
SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Zeneca Ltd
(B) STREET: 15 Stanhope Gate
(C) CITY: London
(D) STATE: Greater London
(E) COUNTRY: England
(F) POSTAL CODE (ZIP): W1Y 6LN
(G) TELEPHONE: 0171 304 5000
(H) TELEFAX: 0171 304 5151
(I) TELEX: 0171 834 2042

(ii) TITLE OF INVENTION: PROCESS

(iii) NUMBER OF SEQUENCES: 3

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: GB 9726897.3
(B) FILING DATE: 20-DEC-1997

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 523 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GCACGCTTGA CGGTTTTCAC CAAATGCGAA AATATGACCA AATTGAGAAT CCGAAAATGA 60
ATGGATAGAA GATTGGTTAC CAACTGAGAA ATAACCCCAC ACATTAGAAG AAGAACGGAA 120
ATTCAATTCA TGTAAAGAAC CACCACTTGG TTTAAAACCT TCACCAGGAT CTTCAGAAGT 180
AATACGACAA GCAGTACAAT GTCCCTTTGG TGTTGGTCTT CTTTGACTAA CCAATGAAGT 240
TTCTGACTTG AATTCAAAAT CAATATCAGT AGTGGTATGA GGATCGGCAC CGCACAAAGT 300
TCTGATATCT CTGATTCTAT GCATTGGTAT ACCCATAGCA ATTTGTAATT GAGCAGCTGG 360
TAAATTAACA CCTGTCACCA TTTCAGTGGT TGGATGTTCA ACTTGCAATC TTGGGTTCAA 420
TTCCAAAAAG TAGAATTTAT CTTCAGCGTG GGGAGTAAAG GTACTCAACA GTACCAGGGG 480
GTTACATAAC CAACTTATTT TACCCAATCT GACTGGTGGA TTT 523

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8054 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

AATATATTGC TTCCTTTTGA TAGGAAGTAA CTCCGAGTGT TTGAATTTGA TATATGTTAT 60
TCATATACGT TCAATGGCTC TCTTCTATGC TTTGTATATA CTTTCTTTTG AATAGATACT 120
CATGTAAAGA GATTTGAAAC CATATTCTAA CCAACAAAAA TATTGTACGG TATAGGTTAG 180
AAAAAAAACT CCGTAAGGTC CGCTTACACG GTTAAATTGA AAACACGTTA AAAATATATT 240
TGGGTAATGG ACTAAGCTAT ATACAGTACT CAACAAAAAT GAAATCAAAC ACAATGTTCT 300
TTGGGAAATT CATTTCATGC AACTAGGGTG ATTCTCTTTC TACTATCCAA CAACGATAAC 360
CCTGCTTTTG AAAAATCTTT TCTAAATTCA AATTGATATA ATTCTTATTT ATATATTACT 420
TTCTTTTTCC CATATAACCC CATTTTTTTT TTGGAATCAT ATTTGTTTTT GATTTTTGCT 480
TTCCCTTTCA GTCTGAGGAA CATACTAATT ACGAACAACA ATTATACATC CAATCTTCAT 540
CTAACGAATT GATTATTTAC ATTTATTAAA CCCTTGGATA CAAACTGATT ACACTTTTTA 600
GTTAGTTTGT TCAATTATAA GGGTATTATA CAACAAAGAT ATCATTTAAA GTTAAATCTC 660
AATCTGGAAT AATAAAAGTA TTCAACACTT TTGCTTACAA TAGGTATGTT CAAAATCAAT 720
TGAAGCCATC GAGATAAGAA ATTAAGCAAA AACGTTTACA ATTGTTGTGT GTGTGTTGCA 780
GTGTTTGAAG AAGCTCGAGT GATTGCTTTT CTTCGGCATC AGCTGTGTTG GGAACATCTT 840
GTCGTTAAAG TTTCGGAGTA ATATTAGAGT AATGGAACGA AAAAAACAAA ATAAAGTTCT 900
GGAACCACAA AGATTTGAAA AATTGGGTAG AAACAAAAAA AAGACAAAGC AGGAACCCAA 960
CAATAAATGA ATAAACACTC AAAAACTACT CACAACAACA ACACTTATTT TCACTTGCTT 1020
TATTTCTTCG ATTTTTTATG AGATGCAAAT TATCTCTAAT AAAGAATACT AACTCACTTG 1080
TACATAGATC GCGTTTCCTA ATTACAAAAC CACAACTATA TATACCTCAT CGTCATTATA 1140
TCCCATTCAA GAACATATTC AAGTCATTGT TAATGTCAGA TCAATCTCCA TCTCCTAGTC 1200
CTAGCGATTC CCTTAGCTAC ACTACATTAC ATGAAAATTT GCCATCTCAT TTCTTGGGTG 1260
GAAATTCAGT TTTGAATGCT GAACCTTCTA AAGTCAGAGA CTTTGTCAGA GCTCATCAAG 1320
GTCATACAGT TATTTCGAAA ATTTTAATTG CCAACAATGG TATAGCTGCA GTTAAAGAAA 1380
TCAGATCAGT TAGAAAATGG GCTTATGAAA CATTTGGTGA CGAAAAAGCC ATACAGTTTA 1440
CCGTTATGGC CACTCCAGAA GATTTGGAAG CTAATGCCGA ATATATTAGA ATGGCCGACC 1500 
AATTCATTGA AGTCCCTGGT GGCACCAATA ACAATAACTA TGCTAATGTT GATCTCATTG 1560 
TAGAGATAGC AGAAAGTACA AATGCTCATG CCGTTTGGGC TGGGTGGGGG CATGCTTCAG 1620 
AGAATCCTTT GTTACCAGAA AAATTAGCTG CATCTCCCAA AAAAATTATT TTTATTGGTC 1680
CTCCTGGTTC AGCTATGAGA TCTTTAGGTG ACAAGATTTC ATCTACTATA GTTGCTCAAC 1740 
ATGCTCAAGT ACCATGTATT CCATGGTCCG GTACTGGTGT TGATGAAGTG AAAATAGACC 1800 
CACAAACTAA TTTGGTTTCT GTTGCTGATG ATATTTATGC CAAAGGGTGC TGTACTAGTC 1860 

CAGAAGATGG TTTAGAAAAA GCCAAAAAAA TTGGGTTCCC AGTTATGATT AAAGCCTCTG 1920 

AAGGTGGTGG TGGTAAAGGT ATTAGAAAAG TTGATGATGA GAAAAACTTC ATTACCTTAT 1980

ACAACCAAGC AGCTAATGAA ATACCAGGTT CTCCTATCTT TATTATGAAG TTAGCAGGTG 2040 

ATGCCAGACA TTTAGAAGTT CAATTACTAG CAGATCAATA CGGTACTAAC ATTTCCCTTT 2100 

TTGGAAGAGA TTGTTCCGTA CAAAGAAGAC ACCAAAAGAT TATTGAAGAA GCACCAGTCA 2160 

CCATTGCCAG AAAGGAAACT TTCCACGAAA TGGAAAATGC AGCAGTCAGA TTGGGTAAAT 2220 

TAGTTGGTTA TGTATCCGCT GGTACTGTTG AGTATCTTTA CTCCCACGCT GAAGATAAAT 2280

TCTACTTTTT GGAATTGAAC CCAAGATTGC AAGTTGAACA TCCAACCACT GAAATGGTGA 2340 

CAGGTGTTAA TTTACCAGCT GCTCAATTAC AAATTGCTAT GGGTATACCA ATGCATAGAA 2400 

TCAGAGATAT CAGAACTTTG TACGGTGCCG ATCCTCATAC CACTACTGAT ATTGATTTTG 2460 

AATTCAAGTC AGAAACTTCA TTGGTTAGTC AAAGAAGACC AACACCAAAG GGACATTGTA 2520 

CTGCTTGTCG TATTACTTCT GAAGATCCTG GTGAAGGTTT TAAACCAAGT GGTGGTTCTT 2580

TACATGAATT GAATTTCCGT TCTTCTTCTA ATGTGTGGGG TTATTTCTCA GTTGGTAACC 2640 

AATCTTCTAT CCATTCATTT TCGGATTCTC AATTTGGTCA TATTTTCGCA TTTGGTGAAA 2700 

ACCGTCAAGC TTCAAGAAAA CATATGGTTG TTGCCTTGAA AGAATTGAGT ATTAGAGGTG 2760 

ATTTTAGAAC TACTGTTGAG TATTTAATCA AATTGTTAGA AACTCCAGAT TTCGAGGATA 2820 

ATACCATTAC AACTGGTTGG TTGGATGAAT TAATCACCAA AAAGTTGACT GCTGAAAGAC 2880

CAGATCCAAT AGTTGCTGTT GTTTGTGGAG CTGTAACCAA AGCACACATC CAGGCTGAGG 2940 

AAGAGAAAAA GGAATACATC CAATCTTTGG AAAAAGGTCA AGTTCCTCAC AGAAACTTAT 3000 

TGAAAACTAT TTTCCCAGTT GAGTTTATTT ATGAAGGTGA AAGATACAAG TTCACTGCTA 3060 

CTAAATCTTC AGAAGATAAA TATACTTTGT TCCTTAATGG TTCTCGTTGT GTTGTTGGTG 3120 

CACGTTCATT GTCCGATGGT GGTTTATTGT GTGCATTAGA TGGGAAATCA CATTCTGTCT 3180

ATTGGAAGGA AGAGGCATCT GCCACTAGAT TATCAGTTGA TGGCAAAACT TGTTTATTAG 3240 

AAGTTGAAAA TGATCCAACA CAATTAAGAA CTCCATCTCC AGGTAAATTG GTCAAGTATT 3300 

TGGTTGACAG TGGTGAACAT GTTGATGCTG GTCAACCATA CGCTGAAGTC GAAGTTATGA 3360 

AAATGTGTAT GCCTTTGATT GCTCAAGAAA ATGGGGTAGT GCAGTTGATT AAACAACCGG 3420 

GTTCCACAGT TAATGCTGGT GATATCTTGG CCATTTTGGC ATTGGACGAT CCATCTAAGG 3480

TCAAACATGC TAAACCATTT GAAGGTACTT TACCATCTAT GGGTGAGCCA AATGTTACAG 3540 

GTACTAAACC AGCACATAAA TTCAATCATT GTGCTGGTAT TTTGAAAAAC ATTTTGGCTG 3600 

GTTATGATAA TCAAGTGATT TTGAATTCTA CTTTAAAGAG TCTTGGTGAA GTTTTGAAAG 3660 

ACAATGAATT GCCATACTCT GAATGGCAAC AACAAATTTC AGCTTTACAC TCCAGATTGC 3720 

CACCTAAATT GGATGACGGA TTGACTGCAT TGGTTGAAAG AACTCAAAGT AGAGGTGCTG 3780

AATTCCCTGC TCGTCAAATT TTAAAACTCA TCACCAAATC AATTGCTGAA AATGGTAATG 3840 

ATATGTTAGA AGATGTTGTT GCACCATTGG TTTCTATTGC CACAAGTTAC CAGAATGGTT 3900 

TGGTTGAACA CGAATACGAT TACTTTGCAT CTTTGATTAA CGAATATTAT GACGTTGAAA 3960 

GTTTGTTTTC AGGTGAAAAT GTTAGAGAAG ATAATGTTAT CTTGAAATTA AGAGATGAAA 4020 

ACAAATCTGA TTTGAAAAAA GTTATTGGTA TTGGTTTGTC TCATTCACGT GTTAGTGCCA 4080

AGAACAATTT GATTTTAGCT ATTTTGGACA TTTATGAACC ATTGTTGCAA TCCAACTCGT 4140 

CAGTTGCTGC CTCTATCAGA GAAGCTTTAA AGAACTTGTT CATTAGACCT CGTGCTTGTG 4200

CCAAAGTTGC ATTAAAGGCA AGAGAAATTT TAATTCAATG TTCTTTACCT TCCATCAAGG 4260

AAAGATCCGA TCAATTGGAA CATATTTTGA GGTCATCTGT TGTTCAAACC TCTTATGGTG 4320 

AAATTTTTGC TAAACATAGA GAACCAAATT TGGAAATTAT TCGTGAGGTT GTTGATTCCA 4380 

AACATATTGT TTTTGATGTG TTGGCACAAT TCTTAATCAA TCCAGACCCA TGGGTTGCCA 4440 

TTGCTGCCGC TGAAGTTTAT GTCAGACGTT CATACCGTGC TTATGATTTG GGTAAAATTG 4500

AATATCATGT TAATGACAGA CTTCCTATTG TTGAATGGAA ATTCAAGTTG GCTAATATGG 4560 

GAGCCGCTGG TGTAAACGAT GCTCAACAGG CTGCTGCTGC CGGTGGCGAT GATTCGACAT 4620 

CTATGAAACA TGCAGCTTCT GTGTCTGATT TGACCTTTGT TGTTGATTCT AAAACCGAGC 4680 

ATTCCACAAG AACTGGTGTT TTAGCTCCAG CAAGACACTT GGATGATGTT GATGAAACTC 4740 

TTACAGCTGC ATTGGAACAA TTCCAACCAG CCGATGCTAT TTCATTTAAA GCAAAGGGTG 4800

AAACTCCAGA GTTATTAAAT GTTTTGAATA TTGTCATTAC CAGTATTGAT GGTTACTCCG 4860 

ATGAAAATGA ATACTTGAGC AGAATTAATG AAATCTTGTG CGAATACAAA GAAGAGTTGA 4920

TTTCTGCTGG TGTTCGTCGT GTTACATTTG TTTTTGCTCA TCAAATTGGT CAATATCCTA 4980

AATATTATAC TTTTACTGGT CCTGACTATG AAGAAAACAA GGTTATTAGA CACATTGAAC 5040 

CAGCTTTGGC TTTCCAATTG GAATTGGGAA GATTAGCCAA TTTCGATATC AAACCAATTT 5100

TCACTAACAA CAGAAACATC CATGTATATG ATGCAATTGG GAAGAATGCT CCTTCTGATA 5160 

AAAGATTTTT CACCAGAGGG ATTATTAGAA CCGGTGTTCT TAAAGAAGAC ATTAGCATTA 5220 

GTGAATATTT GATTGCTGAA TCCAACAGAT TAATGAATGA TATTTTGGAT ACTTTAGAAG 5280 

TTATTGACAC TTCTAATTCT GATTTAAACC ATATTTTCAT TAACTTTTCC AATGCTTTCA 5340 

ATGTTCAAGC TTCAGATGTT GAGGCTGCCT TTGGATCATT CTTAGAAAGA TTTGGTAGAA 5400

GATTATGGAG ATTAAGAGTT ACTGGTGCTG AAATTAGAAT TGTCTGTACT GATCCTCAAG 5460 

GTACTTCGTT CCCATTGCGT GCTATCATTA ATAATGTTTC TGGTTATGTT GTCAAATCAG 5520 

AATTGTATTT GGAAGTGAAA AATCCTAAAG GTGAATGGGT TTTCAAATCC ATTGGTCATC 5580 

CTGGTTCCAT GCATTTGAGA CCTATCTCAA CTCCATATCC AGTTAAAGAA TCTTTACAAC 5640 

CAAAACGTTA CAAGGCTCAC AATATGGGTA CCACTTATGT GTATGACTTC CCAGAATTGT 5700

TTCGTCAAGC AACAATTTCA CAATGGAAAA AATATGGCAA AAAAGTACCA AAAGATGTTT 5760 

TCGTGTCTTT AGAATTGATC ACTGATGAAA CTGATTCCTT AATAGCTGTT GAAAGAGATC 5820 

CGGGTGCTAA CAAAATTGGA ATGGTTGGAT TCAAAGTCAC TGCTAAAACT CCTGAATACC 5880 

CTCATGGTCG TCAATTAATT ATTGTTGCCA ATGATATCAC CCACAAGATT GGTTCTTTTG 5940 

GTCCAGAAGA AGATAATTAT TTCAACAAGT GTACTGAATT GGCCAGAAAA TTAGGTATTC 6000

CAAGAATTTA CCTTTCTGCA AATTCAGGTG CTAGAATTGG TGTTGCTGAG GAATTGATTC 6060 

CATTATACCA AGTTGCCTGG AATGAAGAAG GGTCTCCTGA CAAAGGATTC AGATACTTGT 6120 

ACTTGAGTAC TGCTGCTAAA GAGTCTTTAG AAAAAGATGG TAAAAGTGAC AGTGTTGTTA 6180 

CTGAACGTAT TGTTGAAAAA GGTGAAGAGC GTCATGTCAT TAAAGCTATT ATTGGTGCCG 6240 

AAGATGGCTT AGGGGTTGAA TGTCTTAAAG GATCAGGTTT AATTGCTGGT GCCACATCAA 6300

GAGCTTACAA GGATATATTT ACCATCACTT TGGTAACTTG TAGATCTGTT GGTATTGGTG 6360 

CTTATTTGGT TAGATTGGGT CAAAGAGCCA TTCAAATCGA TGGTCAACCT ATTATTTTAA 6420 

CTGGTGCTCC TGCTATCAAT AAATTGTTGG GTAGAGAAGT GTATTCTTCC AATCTTCAAT 6480 

TGGGTGGTAC TCAAATCATG TACAATAATG GTGTTTCTCA TTTGACAGCT AATGATGATT 6540 

TGGCTGGGGT TGAAAAAATT ATGGAATGGT TATCATATGT TCCAGCTAAA CGTGGTTTAC 6600

CAGTGCCAAT TTTGGAATCA GAAGATTCTT GGGACAGAGA TGTTGATTAC TACCCACCAA 6660 

AACAAGAAGC TTTTGATGTT AGATGGATGA TCCAAGGTAG AGAAGTTGAT GGTGAATATG 6720 

AATCTGGGTT ATTTGATAAA GATTCATTCC AAGAAACATT ATCTGGTTGG GCTAAAGGTG 6780 

TTGTTGTTGG TAGAGCACGT TTGGGTGGTA TTCCAATTGG TGTTATTGGT GTCGAAACCA 6840 

GAACAGTGGA AAACTTGATT CCTGCTGATC CAGCAAATCC AGACTCTACA GAAAGTTTGA 6900

TTCAAGAAGC AGGTCAAGTG TGGTATCCTA ACTCTGCTTT TAAGACAGCA CAAGCTATAA 6960 

ATGATTTCAA CAATGGTGAA CAATTGCCAT TAATGATTTT AGCAAATTGG AGAGGTTTCT 7020

CTGGTGGTCA AAGAGATATG TACAATGAAG TCTTGAAATA TGGTTCATTT ATTGTTGATG 7080

CTTTAGTTGA CTTCAAGCAA CCTATCTTCA CTTACATTCC ACCAAATGGA GAATTGAGAG 7140 

GTGGCTCTTG GGTTGTTGTT GATCCAACCA TCAACTCAGA TATGATGGAA ATGTATGCCG 7200 

ATGTCGATTC GAGAGCTGGT GTTTTGGAAC CAGAAGGTAT GGTTGGTATC AAATACAGAC 7260 

GTGATAAATT ATTAGCAACT ATGGAAAGAT TAGATCCAAC TTATGGTGAA ATGAAAGCTA 7320

AGTTAAATGA CTCGTCATTA TCTCCAGAAG AACACTCGAA AATAAGCGCC AAATTGTTTG 7380 

CACGTGAAAA GGCTTTATTA CCAATTTATG CTCAAATTTC CGTTCAATTT GCTGACTTGC 7440 

ACGATAGATC AGGTCGTATG TTGGCCAAGG GAGTTATTAG AAAGGAAATC AAATGGACTG 7500 

ATGCTAGACG TTTCTTCTTC TGGAGATTGA GAAGAAGATT GAACGAGGAA TATGTTTTGA 7560 

GATTGATTAG TGAACAAATT AAAGATTCTA GCAAATTGGA AAGAGTTGCC AGATTGAAGA 7620

GTTGGATGCC AACTGTTGAA TACGATGATG ACCAAGCTGT CAGTAACTGG ATTGAAGAGA 7680 

ACCATGCCAA ATTGCAAAAG AGAGTTAATG AATTGAAACA AGAAGTTTCA AGAACCAAGA 7740 

TTATGAGATT ATTAAAAGAG GATCCAAATA GTGCAATTTC TGCAATGAAA GACTATGTTG 7800 

AAAGATTGTC AAAAGAAGAT AAAGAGAAAT TCCTCAAGGC ATTGAAGTAG AAGTGGTTTC 7860 

CATTAATTCA ACTTTTTAAT GACATTGAAA GTAGTAGTAG TTGTTGTTTT TTAGATTTAA 7920

GTATATTATA TTATGTAATA AATTATAGAA AGTAATTATA GTTTTGACGG TTAATTGACG 7980 

AGAGTGGGAA ATTGGCTTTT TTGTTGCTCG TGTGATGAAA CAGTGATTGA CACAAAAAAA 8040 

TAGACAATGA AAAC 8054 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2270 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Arg Cys Lys Leu Ser Leu Ile Lys Asn Thr Asn Ser Leu Val His
1 5 10 15

Arg Ser Arg Phe Leu Ile Thr Lys Pro Gln Leu Tyr Ile Pro His Arg
20 25 30

His Tyr Ile Pro Phe Lys Asn Ile Phe Lys Ser Leu Leu Met Ser Asp
35 40 45

Gln Ser Pro Ser Pro Ser Pro Ser Asp Ser Leu Ser Tyr Thr Thr Leu
50 55 60

His Glu Asn Leu Pro Ser His Phe Leu Gly Gly Asn Ser Val Leu Asn
65 70 75 80

Ala Glu Pro Ser Lys Val Arg Asp Phe Val Arg Ala His Gln Gly His
85 90 95

Thr Val Ile Ser Lys Ile Leu Ile Ala Asn Asn Gly Ile Ala Ala Val
100 105 110

Lys Glu Ile Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr Phe Gly Asp
115 120 125

Glu Lys Ala Ile Gln Phe Thr Val Met Ala Thr Pro Glu Asp Leu Glu
130 135 140

Ala Asn Ala Glu Tyr Ile Arg Met Ala Asp Gln Phe Ile Glu Val Pro
145 150 155 160

Gly Gly Thr Asn Asn Asn Asn Tyr Ala Asn Val Asp Leu Ile Val Glu
165 170 175

Ile Ala Glu Ser Thr Asn Ala His Ala Val Trp Ala Gly Trp Gly His
180 185 190

Ala Ser Glu Asn Pro Leu Leu Pro Glu Lys Leu Ala Ala Ser Pro Lys
195 200 205

Lys Ile Ile Phe Ile Gly Pro Pro Gly Ser Ala Met Arg Ser Leu Gly
210 215 220

Asp Lys Ile Ser Ser Thr Ile Val Ala Gln His Ala Gln Val Pro Cys
225 230 235 240

Ile Pro Trp Ser Gly Thr Gly Val Asp Glu Val Lys Ile Asp Pro Gln
245 250 255

Thr Asn Leu Val Ser Val Ala Asp Asp Ile Tyr Ala Lys Gly Cys Cys
260 265 270

Thr Ser Pro Glu Asp Gly Leu Glu Lys Ala Lys Lys Ile Gly Phe Pro
275 280 285

Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly Lys Gly Ile Arg Lys
290 295 300

Val Asp Asp Glu Lys Asn Phe Ile Thr Leu Tyr Asn Gln Ala Ala Asn
305 310 315 320

Glu Ile Pro Gly Ser Pro Ile Phe Ile Met Lys Leu Ala Gly Asp Ala
325 330 335

Arg His Leu Glu Val Gln Leu Leu Ala Asp Gln Tyr Gly Thr Asn Ile
340 345 350

Ser Leu Phe Gly Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile
355 360 365

Ile Glu Glu Ala Pro Val Thr Ile Ala Arg Lys Glu Thr Phe His Glu
370 375 380

Met Glu Asn Ala Ala Val Arg Leu Gly Lys Leu Val Gly Tyr Val Ser
385 390 395 400

Ala Gly Thr Val Glu Tyr Leu Tyr Ser His Ala Glu Asp Lys Phe Tyr
405 410 415

Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Thr Thr Glu
420 425 430

Met Val Thr Gly Val Asn Leu Pro Ala Ala Gln Leu Gln Ile Ala Met
435 440 445

Gly Ile Pro Met His Arg Ile Arg Asp Ile Arg Thr Leu Tyr Gly Ala
450 455 460

Asp Pro His Thr Thr Thr Asp Ile Asp Phe Glu Phe Lys Ser Glu Thr
465 470 475 480

Ser Leu Val Ser Gln Arg Arg Pro Thr Pro Lys Gly His Cys Thr Ala
485 490 495

Cys Arg Ile Thr Ser Glu Asp Pro Gly Glu Gly Phe Lys Pro Ser Gly
500 505 510

Gly Ser Leu His Glu Leu Asn Phe Arg Ser Ser Ser Asn Val Trp Gly
515 520 525

Tyr Phe Ser Val Gly Asn Gln Ser Ser Ile His Ser Phe Ser Asp Ser
530 535 540

Gln Phe Gly His Ile Phe Ala Phe Gly Glu Asn Arg Gln Ala Ser Arg
545 550 555 560

Lys His Met Val Val Ala Leu Lys Glu Leu Ser Ile Arg Gly Asp Phe
565 570 575

Arg Thr Thr Val Glu Tyr Leu Ile Lys Leu Leu Glu Thr Pro Asp Phe
580 585 590

Glu Asp Asn Thr Ile Thr Thr Gly Trp Leu Asp Glu Leu Ile Thr Lys
595 600 605

Lys Leu Thr Ala Glu Arg Pro Asp Pro Ile Val Ala Val Val Cys Gly
610 615 620

Ala Val Thr Lys Ala His Ile Gln Ala Glu Glu Glu Lys Lys Glu Tyr
625 630 635 640

Ile Gln Ser Leu Glu Lys Gly Gln Val Pro His Arg Asn Leu Leu Lys
645 650 655

Thr Ile Phe Pro Val Glu Phe Ile Tyr Glu Gly Glu Arg Tyr Lys Phe
660 665 670

Thr Ala Thr Lys Ser Ser Glu Asp Lys Tyr Thr Leu Phe Leu Asn Gly
675 680 685

Ser Arg Cys Val Val Gly Ala Arg Ser Leu Ser Asp Gly Gly Leu Leu
690 695 700

Cys Ala Leu Asp Gly Lys Ser His Ser Val Tyr Trp Lys Glu Glu Ala
705 710 715 720

Ser Ala Thr Arg Leu Ser Val Asp Gly Lys Thr Cys Leu Leu Glu Val
725 730 735

Glu Asn Asp Pro Thr Gln Leu Arg Thr Pro Ser Pro Gly Lys Leu Val
740 745 750

Lys Tyr Leu Val Asp Ser Gly Glu His Val Asp Ala Gly Gln Pro Tyr
755 760 765

Ala Glu Val Glu Val Met Lys Met Cys Met Pro Leu Ile Ala Gln Glu
770 775 780

Asn Gly Val Val Gln Leu Ile Lys Gln Pro Gly Ser Thr Val Asn Ala
785 790 795 800

Gly Asp Ile Leu Ala Ile Leu Ala Leu Asp Asp Pro Ser Lys Val Lys
805 810 815

His Ala Lys Pro Phe Glu Gly Thr Leu Pro Ser Met Gly Glu Pro Asn
820 825 830

Val Thr Gly Thr Lys Pro Ala His Lys Phe Asn His Cys Ala Gly Ile
835 840 845

Leu Lys Asn Ile Leu Ala Gly Tyr Asp Asn Gln Val Ile Leu Asn Ser
850 855 860

Thr Leu Lys Ser Leu Gly Glu Val Leu Lys Asp Asn Glu Leu Pro Tyr
865 870 875 880

Ser Glu Trp Gln Gln Gln Ile Ser Ala Leu His Ser Arg Leu Pro Pro
885 890 895

Lys Leu Asp Asp Gly Leu Thr Ala Leu Val Glu Arg Thr Gln Ser Arg
900 905 910

Gly Ala Glu Phe Pro Ala Arg Gln Ile Leu Lys Leu Ile Thr Lys Ser
915 920 925

Ile Ala Glu Asn Gly Asn Asp Met Leu Glu Asp Val Val Ala Pro Leu
930 935 940

Val Ser Ile Ala Thr Ser Tyr Gln Asn Gly Leu Val Glu His Glu Tyr
945 950 955 960

Asp Tyr Phe Ala Ser Leu Ile Asn Glu Tyr Tyr Asp Val Glu Ser Leu
965 970 975

Phe Ser Gly Glu Asn Val Arg Glu Asp Asn Val Ile Leu Lys Leu Arg
980 985 990

Asp Glu Asn Lys Ser Asp Leu Lys Lys Val Ile Gly Ile Gly Leu Ser
995 1000 1005

His Ser Arg Val Ser Ala Lys Asn Asn Leu Ile Leu Ala Ile Leu Asp
1010 1015 1020

Ile Tyr Glu Pro Leu Leu Gln Ser Asn Ser Ser Val Ala Ala Ser Ile
1025 1030 1035 1040

Arg Glu Ala Leu Lys Asn Leu Phe Ile Arg Pro Arg Ala Cys Ala Lys
1045 1050 1055

Val Ala Leu Lys Ala Arg Glu Ile Leu Ile Gln Cys Ser Leu Pro Ser
1060 1065 1070

Ile Lys Glu Arg Ser Asp Gln Leu Glu His Ile Leu Arg Ser Ser Val
1075 1080 1085

Val Gln Thr Ser Tyr Gly Glu Ile Phe Ala Lys His Arg Glu Pro Asn
1090 1095 1100

Leu Glu Ile Ile Arg Glu Val Val Asp Ser Lys His Ile Val Phe Asp
1105 1110 1115 1120

Val Leu Ala Gln Phe Leu Ile Asn Pro Asp Pro Trp Val Ala Ile Ala
1125 1130 1135

Ala Ala Glu Val Tyr Val Arg Arg Ser Tyr Arg Ala Tyr Asp Leu Gly
1140 1145 1150

Lys Ile Glu Tyr His Val Asn Asp Arg Leu Pro Ile Val Glu Trp Lys
1155 1160 1165

Phe Lys Leu Ala Asn Met Gly Ala Ala Gly Val Asn Asp Ala Gln Gln
1170 1175 1180

Ala Ala Ala Ala Gly Gly Asp Asp Ser Thr Ser Met Lys His Ala Ala
1185 1190 1195 1200

Ser Val Ser Asp Leu Thr Phe Val Val Asp Ser Lys Thr Glu His Ser
1205 1210 1215

Thr Arg Thr Gly Val Leu Ala Pro Ala Arg His Leu Asp Asp Val Asp
1220 1225 1230

Glu Thr Leu Thr Ala Ala Leu Glu Gln Phe Gln Pro Ala Asp Ala Ile
1235 1240 1245

Ser Phe Lys Ala Lys Gly Glu Thr Pro Glu Leu Leu Asn Val Leu Asn
1250 1255 1260

Ile Val Ile Thr Ser Ile Asp Gly Tyr Ser Asp Glu Asn Glu Tyr Leu
1265 1270 1275 1280

Ser Arg Ile Asn Glu Ile Leu Cys Glu Tyr Lys Glu Glu Leu Ile Ser
1285 1290 1295

Ala Gly Val Arg Arg Val Thr Phe Val Phe Ala His Gln Ile Gly Gln
1300 1305 1310

Tyr Pro Lys Tyr Tyr Thr Phe Thr Gly Pro Asp Tyr Glu Glu Asn Lys
1315 1320 1325

Val Ile Arg His Ile Glu Pro Ala Leu Ala Phe Gln Leu Glu Leu Gly
1330 1335 1340

Arg Leu Ala Asn Phe Asp Ile Lys Pro Ile Phe Thr Asn Asn Arg Asn
1345 1350 1355 1360

Ile His Val Tyr Asp Ala Ile Gly Lys Asn Ala Pro Ser Asp Lys Arg
1365 1370 1375

Phe Phe Thr Arg Gly Ile Ile Arg Thr Gly Val Leu Lys Glu Asp Ile
1380 1385 1390

Ser Ile Ser Glu Tyr Leu Ile Ala Glu Ser Asn Arg Leu Met Asn Asp
1395 1400 1405

Ile Leu Asp Thr Leu Glu Val Ile Asp Thr Ser Asn Ser Asp Leu Asn
1410 1415 1420

His Ile Phe Ile Asn Phe Ser Asn Ala Phe Asn Val Gln Ala Ser Asp
1425 1430 1435 1440

Val Glu Ala Ala Phe Gly Ser Phe Leu Glu Arg Phe Gly Arg Arg Leu
1445 1450 1455

Trp Arg Leu Arg Val Thr Gly Ala Glu Ile Arg Ile Val Cys Thr Asp
1460 1465 1470

Pro Gln Gly Thr Ser Phe Pro Leu Arg Ala Ile Ile Asn Asn Val Ser
1475 1480 1485

Gly Tyr Val Val Lys Ser Glu Leu Tyr Leu Glu Val Lys Asn Pro Lys
1490 1495 1500

Gly Glu Trp Val Phe Lys Ser Ile Gly His Pro Gly Ser Met His Leu
1505 1510 1515 1520

Arg Pro Ile Ser Thr Pro Tyr Pro Val Lys Glu Ser Leu Gln Pro Lys
1525 1530 1535

Arg Tyr Lys Ala His Asn Met Gly Thr Thr Tyr Val Tyr Asp Phe Pro
1540 1545 1550

Glu Leu Phe Arg Gln Ala Thr Ile Ser Gln Trp Lys Lys Tyr Gly Lys
1555 1560 1565

Lys Val Pro Lys Asp Val Phe Val Ser Leu Glu Leu Ile Thr Asp Glu
1570 1575 1580

Thr Asp Ser Leu Ile Ala Val Glu Arg Asp Pro Gly Ala Asn Lys Ile
1585 1590 1595 1600

Gly Met Val Gly Phe Lys Val Thr Ala Lys Thr Pro Glu Tyr Pro His
1605 1610 1615

Gly Arg Gln Leu Ile Ile Val Ala Asn Asp Ile Thr His Lys Ile Gly
1620 1625 1630

Ser Phe Gly Pro Glu Glu Asp Asn Tyr Phe Asn Lys Cys Thr Glu Leu
1635 1640 1645

Ala Arg Lys Leu Gly Ile Pro Arg Ile Tyr Leu Ser Ala Asn Ser Gly
1650 1655 1660

Ala Arg Ile Gly Val Ala Glu Glu Leu Ile Pro Leu Tyr Gln Val Ala
1665 1670 1675 1680

Trp Asn Glu Glu Gly Ser Pro Asp Lys Gly Phe Arg Tyr Leu Tyr Leu
1685 1690 1695

Ser Thr Ala Ala Lys Glu Ser Leu Glu Lys Asp Gly Lys Ser Asp Ser
1700 1705 1710

Val Val Thr Glu Arg Ile Val Glu Lys Gly Glu Glu Arg His Val Ile
1715 1720 1725

Lys Ala Ile Ile Gly Ala Glu Asp Gly Leu Gly Val Glu Cys Leu Lys
1730 1735 1740

Gly Ser Gly Leu Ile Ala Gly Ala Thr Ser Arg Ala Tyr Lys Asp Ile
1745 1750 1755 1760

Phe Thr Ile Thr Leu Val Thr Cys Arg Ser Val Gly Ile Gly Ala Tyr
1765 1770 1775

Leu Val Arg Leu Gly Gln Arg Ala Ile Gln Ile Asp Gly Gln Pro Ile
1780 1785 1790

Ile Leu Thr Gly Ala Pro Ala Ile Asn Lys Leu Leu Gly Arg Glu Val
1795 1800 1805

Tyr Ser Ser Asn Leu Gln Leu Gly Gly Thr Gln Ile Met Tyr Asn Asn
1810 1815 1820

Gly Val Ser His Leu Thr Ala Asn Asp Asp Leu Ala Gly Val Glu Lys
1825 1830 1835 1840

Ile Met Glu Trp Leu Ser Tyr Val Pro Ala Lys Arg Gly Leu Pro Val
1845 1850 1855

Pro Ile Leu Glu Ser Glu Asp Ser Trp Asp Arg Asp Val Asp Tyr Tyr
1860 1865 1870

Pro Pro Lys Gln Glu Ala Phe Asp Val Arg Trp Met Ile Gln Gly Arg
1875 1880 1885

Glu Val Asp Gly Glu Tyr Glu Ser Gly Leu Phe Asp Lys Asp Ser Phe
1890 1895 1900

Gln Glu Thr Leu Ser Gly Trp Ala Lys Gly Val Val Val Gly Arg Ala
1905 1910 1915 1920

Arg Leu Gly Gly Ile Pro Ile Gly Val Ile Gly Val Glu Thr Arg Thr
1925 1930 1935

Val Glu Asn Leu Ile Pro Ala Asp Pro Ala Asn Pro Asp Ser Thr Glu
1940 1945 1950

Ser Leu Ile Gln Glu Ala Gly Gln Val Trp Tyr Pro Asn Ser Ala Phe
1955 1960 1965

Lys Thr Ala Gln Ala Ile Asn Asp Phe Asn Asn Gly Glu Gln Leu Pro
1970 1975 1980

Leu Met Ile Leu Ala Asn Trp Arg Gly Phe Ser Gly Gly Gln Arg Asp
1985 1990 1995 2000

Met Tyr Asn Glu Val Leu Lys Tyr Gly Ser Phe Ile Val Asp Ala Leu
2005 2010 2015

Val Asp Phe Lys Gln Pro Ile Phe Thr Tyr Ile Pro Pro Asn Gly Glu
2020 2025 2030

Leu Arg Gly Gly Ser Trp Val Val Val Asp Pro Thr Ile Asn Ser Asp
2035 2040 2045

Met Met Glu Met Tyr Ala Asp Val Asp Ser Arg Ala Gly Val Leu Glu
2050 2055 2060

Pro Glu Gly Met Val Gly Ile Lys Tyr Arg Arg Asp Lys Leu Leu Ala
2065 2070 2075 2080

Thr Met Glu Arg Leu Asp Pro Thr Tyr Gly Glu Met Lys Ala Lys Leu
2085 2090 2095

Asn Asp Ser Ser Leu Ser Pro Glu Glu His Ser Lys Ile Ser Ala Lys
2100 2105 2110

Leu Phe Ala Arg Glu Lys Ala Leu Leu Pro Ile Tyr Ala Gln Ile Ser
2115 2120 2125

Val Gln Phe Ala Asp Leu His Asp Arg Ser Gly Arg Met Leu Ala Lys
2130 2135 2140

Gly Val Ile Arg Lys Glu Ile Lys Trp Thr Asp Ala Arg Arg Phe Phe
2145 2150 2155 2160

Phe Trp Arg Leu Arg Arg Arg Leu Asn Glu Glu Tyr Val Leu Arg Leu
2165 2170 2175

Ile Ser Glu Gln Ile Lys Asp Ser Ser Lys Leu Glu Arg Val Ala Arg
2180 2185 2190

Leu Lys Ser Trp Met Pro Thr Val Glu Tyr Asp Asp Asp Gln Ala Val
2195 2200 2205

Ser Asn Trp Ile Glu Glu Asn His Ala Lys Leu Gln Lys Arg Val Asn
2210 2215 2220

Glu Leu Lys Gln Glu Val Ser Arg Thr Lys Ile Met Arg Leu Leu Lys
2225 2230 2235 2240

Glu Asp Pro Asn Ser Ala Ile Ser Ala Met Lys Asp Tyr Val Glu Arg
2245 2250 2255

Leu Ser Lys Glu Asp Lys Glu Lys Phe Leu Lys Ala Leu Lys
2260 2265 2270